Regular Policies in Abstract Dynamic Programming

Author

  • Dimitri P. Bertsekas
Abstract

We consider challenging dynamic programming models where the associated Bellman equation and the value and policy iteration algorithms commonly exhibit complex and even pathological behavior. Our analysis is based on the new notion of regular policies. These are policies that are well-behaved with respect to value and policy iteration, and are patterned after proper policies, which are central in the theory of stochastic shortest path problems. We show that the optimal cost function over regular policies may have favorable value and policy iteration properties, which the optimal cost function over all policies need not have. We accordingly develop a unifying methodology to address long-standing analytical and algorithmic issues in broad classes of undiscounted models, including stochastic and minimax shortest path problems, as well as positive cost, negative cost, risk-sensitive, and multiplicative cost problems.
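As background for the proper-policy notion mentioned in the abstract, the following is a minimal sketch (not from the paper): in a finite-state stochastic shortest path problem, a stationary policy μ is proper if, under μ, the termination state is reached with probability 1 from every state. The transition matrix and cost data below are made-up illustration values, and the spectral-radius test is a standard finite-state characterization assumed here, not the paper's method.

```python
import numpy as np

# Hypothetical SSP: two nonterminal states {0, 1} plus an absorbing termination state t.
# Under policy mu, P_mu[x][y] is the transition probability among nonterminal states;
# the remaining probability mass in each row goes to t. g_mu[x] is the expected cost per stage.
P_mu = np.array([[0.5, 0.3],
                 [0.2, 0.4]])        # substochastic: rows sum to less than 1
g_mu = np.array([1.0, 2.0])

# For finite state spaces, mu is proper iff P_mu^k -> 0, i.e. the spectral radius of
# P_mu is < 1, so termination occurs with probability 1 and the expected cost is finite.
is_proper = max(abs(np.linalg.eigvals(P_mu))) < 1
print("policy is proper:", is_proper)

# For a proper policy, the cost J_mu solves the linear Bellman equation
# J_mu = g_mu + P_mu J_mu, i.e. J_mu = (I - P_mu)^{-1} g_mu.
J_mu = np.linalg.solve(np.eye(2) - P_mu, g_mu)
print("cost of mu:", J_mu)
```

Regular policies, as described in the abstract, generalize this well-behavedness beyond the shortest path setting.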


Similar articles

Modelling and Decision-making on Deteriorating Production Systems using Stochastic Dynamic Programming Approach

This study aimed to present a method for formulating optimal production, repair, and replacement policies. The system was based on the production rate of defective parts and machine repairs, and was set up to optimize maintenance activities and related costs. The machine is either repaired or replaced. The machine is changed completely in the replacement process, but the productio...


A Dependent Type Theory for Verification of Information Flow and Access Control Policies

We present Relational Hoare Type Theory (RHTT), a novel language and verification system capable of expressing and verifying rich information flow and access control policies via dependent types. We show that a number of security policies which have been formalized separately in the literature can all be expressed in RHTT using only standard type-theoretic constructions such as monads, higher-o...


Integration of dynamic pricing and overselling with opportunistic cancellation

We extend the concept of dynamic pricing by integrating it with the "overselling with opportunistic cancellation" option, within the framework of dynamic policy. Under this strategy, to sell a stock of a perishable product (or capacity), two prices are offered to customers in any given time period. Customers are categorized as high-paying and low-paying ones. The seller deliberately oversel...


An Optimal Tax Relief Policy with Aligning Markov Chain and Dynamic Programming Approach

In this paper, a Markov chain and dynamic programming were used to represent a suitable pattern for tax relief and for decreasing tax evasion, based on tax earnings in Iran from 2005 to 2009. Results from applying this model showed that tax evasion was 6714 billion Rials. With 4% relief to taxpayers, and by calculating the present value of the received tax, it was reduced to 3108 billion Rials. ...


Regular Policies in Stochastic Optimal Control and Abstract Dynamic Programming

Notation and connection with abstract DP. Mapping of a stationary policy μ: for any control function μ with μ(x) ∈ U(x) for all x, and J ∈ E(X), define the mapping T_μ : E(X) ↦ E(X) by (T_μ J)(x) = E{ g(x, μ(x), w) + α J(f(x, μ(x), w)) }, x ∈ X. Value iteration mapping: for any J ∈ E(X), define the mapping T : E(X) ↦ E(X) by (T J)(x) = inf_{u ∈ U(x)} E{...
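To make the operators T_μ and T above concrete, here is a minimal sketch (not from the cited work) of both mappings and of value iteration J ← TJ on a small finite-state, finite-action discounted problem. The state space, control sets, transition probabilities, stage costs, and discount factor below are all made-up example data.

```python
import numpy as np

# Hypothetical example problem: 3 states, up to 2 controls per state.
X = [0, 1, 2]
U = {0: [0, 1], 1: [0, 1], 2: [0]}           # U(x): admissible controls at state x
alpha = 0.9                                   # discount factor

# g[x][u]: expected stage cost E{ g(x, u, w) }
g = {0: {0: 1.0, 1: 2.0}, 1: {0: 0.5, 1: 1.5}, 2: {0: 0.0}}
# P[x][u][y]: probability that the next state f(x, u, w) equals y
P = {
    0: {0: [0.8, 0.2, 0.0], 1: [0.1, 0.9, 0.0]},
    1: {0: [0.0, 0.7, 0.3], 1: [0.5, 0.0, 0.5]},
    2: {0: [0.0, 0.0, 1.0]},
}

def T_mu(J, mu):
    """(T_mu J)(x) = E{ g(x, mu(x), w) + alpha * J(f(x, mu(x), w)) }."""
    return np.array([g[x][mu[x]] + alpha * np.dot(P[x][mu[x]], J) for x in X])

def T(J):
    """(T J)(x) = min over u in U(x) of E{ g(x, u, w) + alpha * J(f(x, u, w)) }."""
    return np.array([min(g[x][u] + alpha * np.dot(P[x][u], J) for u in U[x]) for x in X])

# Value iteration: repeatedly apply T starting from J = 0.
J = np.zeros(len(X))
for _ in range(200):
    J = T(J)
print("Approximate optimal cost function:", J)
```

In the undiscounted models the paper addresses, the fixed points of T and the convergence of this iteration can be far less well behaved, which is what the regular-policy framework is designed to handle.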



Journal:
  • SIAM Journal on Optimization

Volume 27, Issue 

Pages  -

Publication date: 2017